Google Search

URL: Google.com
Commercial: Yes
Type of site: Web search engine
Registration: Optional
Available languages: Multilingual (124)
Owner: Google
Created by: Sergey Brin and Larry Page
Launched: September 15, 1997[1]
Alexa rank: 1 (January 2012)[2]
Revenue: From AdWords
Current status: Active

Google or Google Web Search is a web search engine owned by Google Inc. Google Search is the most-used search engine on the World Wide Web,[3] receiving several hundred million queries each day through its various services.[4]

The order of search results on Google's search-results pages is based, in part, on a priority rank called a "PageRank". Google Search provides many options for customized search, using Boolean operators such as: exclusion ("-xx"), alternatives ("xx OR yy"), and wildcard ("x * x").[5]

The main purpose of Google Search is to find text in web pages, as opposed to other kinds of data, such as the images handled by Google Image Search. Google Search was originally developed by Larry Page and Sergey Brin in 1997.[6] Google Search provides at least 22 special features beyond the original word-search capability.[7] These include synonyms, weather forecasts, time zones, stock quotes, maps, earthquake data, movie showtimes, airports, home listings, and sports scores. There are special features for numbers, including ranges (70..73),[8] prices, temperatures, money/unit conversions ("10.5 cm in inches"), calculations ("3*4+sqrt(6)-pi/2"), package tracking, patents, area codes,[7] and language translation of displayed pages. In June 2011, Google introduced the "Google Voice Search" and "Search by Image" features, which allow users to search by speaking a query or by supplying an image.[9]
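The two numeric queries quoted above can be checked by hand. The following is only an illustrative sketch in Python of the arithmetic involved, not a description of how Google evaluates such queries; the helper name cm_to_inches is hypothetical.

```python
import math

# The calculator-style query "3*4+sqrt(6)-pi/2", evaluated with the math module.
calculation = 3 * 4 + math.sqrt(6) - math.pi / 2   # approximately 12.8787

CM_PER_INCH = 2.54  # exact by definition of the international inch


def cm_to_inches(cm):
    """Convert a length in centimetres to inches (hypothetical helper)."""
    return cm / CM_PER_INCH


# The unit-conversion query "10.5 cm in inches".
inches = cm_to_inches(10.5)   # approximately 4.134

print(f"{calculation:.4f}")
print(f"{inches:.3f}")
```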

The frequency of use of many search terms has reached a volume at which it may indicate broader economic, social and health trends.[10] Data about the frequency of use of search terms on Google (available through Google AdWords, Google Trends, and Google Insights for Search) have been shown to correlate with flu outbreaks and unemployment levels, and to provide the information faster than traditional reporting methods and government surveys.

Search engine

PageRank

Google's rise to success was in large part due to a patented algorithm called PageRank, which helps rank web pages that match a given search string.[Brin, S.; Page, L. (1998). "The anatomy of a large-scale hypertextual Web search engine". Computer Networks and ISDN Systems 30: 107–117. doi:10.1016/S0169-7552(98)00110-X. ISSN 0169-7552. http://infolab.stanford.edu/pub/papers/google.pdf.] When Google was a Stanford research project, it was nicknamed BackRub because the technology checks backlinks to determine a site's importance. Previous keyword-based methods of ranking search results, used by many search engines that were once more popular than Google, would rank pages by how often the search terms occurred in the page, or by how strongly associated the search terms were within each resulting page. The PageRank algorithm instead analyzes human-generated links, assuming that web pages linked from many important pages are themselves likely to be important. The algorithm computes a recursive score for pages, based on the weighted sum of the PageRanks of the pages linking to them. PageRank is thought to correlate well with human concepts of importance. In addition to PageRank, Google has over the years added many other criteria for determining the ranking of pages on result lists, reported to number over 200 different indicators.["Corporate Information: Technology Overview". Google. http://www.google.com/corporate/tech.html. Retrieved 2009-11-15.][Wired.com] The specifics of these criteria are kept secret to keep spammers at bay and to help Google maintain an edge over its competitors globally.
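The recursive scoring described above can be illustrated with a small power-iteration sketch in Python. It is a simplified reading of the idea, not Google's production algorithm: the damping value of 0.85, the fixed iteration count, the even redistribution of rank from pages without outbound links, and the toy link graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages from a dict mapping each page to its list of outbound links.

    Each page repeatedly passes a share of its current score to the pages
    it links to, so a page linked from high-scoring pages scores highly.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outbound links spreads its
                # score evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked from three pages, "d" from none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph the page with the most inbound links ("c") ends up with the highest score, and the page with no inbound links ("d") keeps only the baseline score.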

Search results

The exact percentage of all web pages that Google indexes is not known, as it is very difficult to calculate accurately. Google not only indexes and caches web pages, but also takes "snapshots" of other file types, which include PDF, Word documents, Excel spreadsheets, Flash SWF, plain text files, and so on.["Google Frequently Asked Questions - File Types". Google. http://www.google.com/help/faq_filetypes.html#what. Retrieved 2011-09-12.] Except in the case of text and SWF files, the cached version is a conversion to (X)HTML, allowing those without the corresponding viewer application to read the file. Users can customize the search engine by setting a default language, using the "SafeSearch" filtering technology, and setting the number of results shown on each page. Google has been criticized for placing long-term cookies on users' machines to store these preferences, a tactic which also enables the company to track a user's search terms and retain the data for more than a year. For any query, up to the first 1000 results can be shown, with a maximum of 100 displayed per page. The ability to specify the number of results is available only if "Instant Search" is not enabled. If "Instant Search" is enabled, only 10 results are displayed, regardless of this setting.

Non-indexable data

Despite Google's immense index, a considerable amount of data is available in online databases that are accessible by means of queries but not by links. This so-called invisible or deep Web is minimally covered by Google and other search engines.[Sherman, Chris and Price, Gary. "The Invisible Web: Uncovering Sources Search Engines Can't See". In: Library Trends 52 (2) 2003: Organizing the Internet. pp. 282–298. http://hdl.handle.net/2142/8528.] The deep Web contains library catalogs, official legislative documents of governments, phone books, and other content which is dynamically prepared to respond to a query.

Google optimization

Since Google is the most popular search engine, many webmasters have become eager to influence their website's Google rankings. An industry of consultants has arisen to help websites increase their rankings on Google and on other search engines. This field, called search engine optimization, attempts to discern patterns in search engine listings and then develop a methodology for improving rankings to draw more searchers to clients' sites. Search engine optimization encompasses both "on page" factors (like body copy, title elements, H1 heading elements and image alt attribute values) and "off page" factors (like anchor text and PageRank). The general idea is to affect Google's relevance algorithm by incorporating the keywords being targeted in various places "on page", in particular the title element and the body copy (note: the higher up in the page, presumably the better its keyword prominence and thus the ranking). Too many occurrences of the keyword, however, cause the page to look suspect to Google's spam checking algorithms. Google has published guidelines for website owners who would like to raise their rankings when using legitimate optimization consultants.["Google Webmaster Guidelines". Google. http://www.google.com/webmasters/guidelines.html. Retrieved 2009-11-15.] It has been hypothesized, and is allegedly the opinion of the owner of one business about which there have been numerous complaints, that negative publicity, such as numerous consumer complaints, may serve as well as favorable comments to elevate page rank on Google Search.[Segal, David (November 26, 2010). "A Bully Finds a Pulpit on the Web". The New York Times. https://www.nytimes.com/2010/11/28/business/28borker.html. Retrieved November 27, 2010.] The particular problem described in The New York Times article, which involved DecorMyEyes, was addressed shortly thereafter by an undisclosed fix in the Google algorithm. According to Google, it was not the frequently published consumer complaints about DecorMyEyes that resulted in the high ranking, but mentions on news websites of events that affected the firm, such as legal actions against it.[Blogspot.com]

Functionality

Google Search consists of a series of localized websites. The largest of those, the google.com site, is the most-visited website in the world.[11] Some of its features include a definition link for most searches including dictionary words, the number of results found for the search, links to other searches (e.g. for words that Google believes to be misspelled, it provides a link to the search results using its proposed spelling), and many more.

Search syntax

Google's search engine normally accepts queries as simple text and breaks up the user's text into a sequence of search terms, which will usually be words that are to occur in the results. One can also use Boolean operators, such as quotation marks (") for a phrase, a prefix such as "+" or "-" for qualified terms (the "+" operator is no longer valid, having been removed from Google on October 19, 2011[12]), or one of several advanced operators, such as "site:". The webpages of "Google Search Basics"[13] describe each of these additional queries and options (see below: Search options). Google's Advanced Search web form gives several additional fields which may be used to qualify searches by such criteria as date of first retrieval. All advanced queries transform to regular queries, usually with additional qualified terms.
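The way separate search criteria collapse into one plain-text query can be sketched as follows. This Python example is illustrative only: the function names, the field names, and the example values are hypothetical, and the URL form https://www.google.com/search?q=... is an assumption not taken from this article.

```python
from urllib.parse import urlencode


def build_query(terms=(), phrase=None, any_of=(), exclude=(), site=None):
    """Combine separate criteria into a single plain-text query string."""
    parts = list(terms)
    if phrase:
        parts.append(f'"{phrase}"')           # quotation marks for a phrase
    if any_of:
        parts.append(" OR ".join(any_of))     # alternatives
    parts.extend(f"-{word}" for word in exclude)   # "-" prefix for exclusion
    if site:
        parts.append(f"site:{site}")          # advanced operator
    return " ".join(parts)


def search_url(query):
    """Percent-encode the query into a search URL (assumed URL layout)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(
    terms=["jaguar"],
    phrase="top speed",
    exclude=["car"],
    site="example.org",
)
print(query)   # jaguar "top speed" -car site:example.org
print(search_url(query))
```

Each field of the hypothetical form contributes one qualified term, which mirrors the statement that advanced queries are ordinary queries with extra qualified terms.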

Query expansion

Google applies query expansion to the submitted search query, transforming it into the query that will actually be used to retrieve results. As with page ranking, the exact details of the algorithm Google uses are deliberately obscure, but certainly the following transformations are among those that occur:

"I'm Feeling Lucky"

Google's homepage includes a button labelled "I'm Feeling Lucky". When a user types in a search and clicks on the button the user will be taken directly to the first search result, bypassing the search engine results page. The thought is that if a user is "feeling lucky", the search engine will return the perfect match the first time without having to page through the search results. However, with the introduction of Google Instant, it is not possible to use the button properly unless the Google Instant function is switched off. According to a study by Tom Chavez of "Rapt", this feature costs Google $110 million a year as 1% of all searches use this feature and bypass all advertising.[15]

On October 30, 2009, for some users, the "I'm Feeling Lucky" button was removed from Google's main page, along with the regular search button. Both buttons were replaced with a field that read, "This space intentionally left blank." This text faded out when the mouse was moved on the page, and normal search functionality was achieved by filling in the search field with the desired terms and pressing enter. A Google spokesperson explained, "This is just a test, and a way for us to gauge whether our users will like an even simpler search interface."[16] Personalized Google homepages retained both buttons and their normal functions.

On May 21, 2010, the 30th anniversary of Pac-Man, the "I'm Feeling Lucky" button was replaced with a button reading "Insert Coin". After pressing the button, the user would begin a Google-themed game of Pac-Man in the area where the Google logo would normally be. Pressing the button a second time would begin a two-player version of the same game that includes Ms. Pac-Man for player 2. This version can be accessed at www.google.com/pacman/[17] as a permanent link to the page.

Rich Snippets

On 12 May 2009, Google announced that they would be parsing the hCard, hReview, and hProduct microformats and using them to populate search result pages with what they called "Rich Snippets".[18]
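As an illustration of what parsing a microformat involves, the following Python sketch extracts a few hCard properties from a page fragment using only the standard library. It is not Google's parser. It assumes well-formed markup without unclosed void elements, looks only at the "fn", "org" and "tel" properties inside an element with the "vcard" root class, and the class name HCardParser and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collect the text of selected hCard properties (simplified sketch)."""

    PROPERTIES = {"fn", "org", "tel"}

    def __init__(self):
        super().__init__()
        self.stack = []   # class names of each currently open element
        self.card = {}    # property name -> collected text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append(set(classes))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Ignore text that is not inside an hCard root element.
        if not any("vcard" in classes for classes in self.stack):
            return
        # Attribute the text to the innermost enclosing property element.
        for classes in reversed(self.stack):
            found = classes & self.PROPERTIES
            if found:
                for prop in found:
                    self.card[prop] = self.card.get(prop, "") + data
                break


sample = """
<div class="vcard">
  <span class="fn">Jane Example</span>,
  <span class="org">Example Widgets</span>
  <span class="tel">+1-555-0100</span>
</div>
<p class="fn">Not part of any card</p>
"""

parser = HCardParser()
parser.feed(sample)
card = {name: text.strip() for name, text in parser.card.items()}
print(card)
```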

Special features

Besides the main search-engine feature of searching for text, Google Search has more than 22 "special features" (activated by entering any of dozens of trigger words) when searching:[7][8][19]

Search options

The webpages maintained by the Google Help Center have text describing more than 15 various search options.[21] The Google operators:

Some of the query options are as follows:

The page-display options (or query types) are:

Error messages

Some searches will give a 403 Forbidden error with the text

"We're sorry... ... but your query looks similar to automated requests from a computer virus or spyware application. To protect our users, we can't process your request right now. We'll restore your access as quickly as possible, so try again soon. In the meantime, if you suspect that your computer or network has been infected, you might want to run a virus checker or spyware remover to make sure that your systems are free of viruses and other spurious software. We apologize for the inconvenience, and hope we'll see you again on Google."

sometimes followed by a CAPTCHA prompt.[22]

The screen was first reported in 2005, and was a response to the heavy use of Google by search engine optimization companies to check on the ranks of sites they were optimizing. The message is triggered by high volumes of requests from a single IP address. Google apparently uses the Google cookie as part of its determination of whether to refuse service.[22]
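Google's actual thresholds and detection logic are not public. Purely to illustrate the general idea of refusing service after a high volume of requests from one IP address, the following Python sketch keeps a sliding window of request times per address; the class name, the limit of three requests, and the ten-second window are all hypothetical.

```python
from collections import defaultdict, deque


class RequestThrottle:
    """Allow at most max_requests per address within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.history = defaultdict(deque)   # address -> recent request times

    def allow(self, ip, now):
        recent = self.history[ip]
        # Forget requests that have fallen out of the time window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False   # too many recent requests: refuse service
        recent.append(now)
        return True


throttle = RequestThrottle(max_requests=3, window_seconds=10)
# Four rapid requests from one address: the fourth is refused.
results = [throttle.allow("192.0.2.1", t) for t in (0, 1, 2, 3)]
print(results)   # [True, True, True, False]
```

A different address is unaffected, and the refused address is served again once its earlier requests age out of the window.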

In June 2009, after the death of pop superstar Michael Jackson, this message appeared to many internet users who were searching Google for news stories related to the singer. Google had assumed the surge in queries to be a DDoS attack, although many of the queries were submitted by legitimate searchers.

January 2009 malware bug

Google flags search results with the message "This site may harm your computer" if the site is known to install malicious software in the background or otherwise surreptitiously. Google does this to protect users against visiting sites that could harm their computers. For approximately 40 minutes on January 31, 2009, all search results were mistakenly classified as malware and could therefore not be clicked; instead a warning message was displayed and the user was required to enter the requested URL manually. The bug was caused by human error.[23][24][25][26] The URL of "/" (which expands to all URLs) was mistakenly added to the malware patterns file.[24][25]
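The effect of such an entry can be shown with a short Python sketch. The format and matching rules of Google's patterns file are not described here, so the example simply assumes that a URL is flagged when any pattern occurs in it as a substring; the function name and the example URLs are hypothetical.

```python
def is_flagged(url, patterns):
    """Return True if any pattern occurs in the URL (assumed matching rule)."""
    return any(pattern in url for pattern in patterns)


urls = [
    "http://malware.example/installer.exe",
    "http://news.example/story.html",
]

# A narrow, hypothetical entry flags only the matching site.
patterns = ["malware.example/"]
print([is_flagged(url, patterns) for url in urls])            # [True, False]

# Adding "/" matches every URL, because every URL contains a slash.
patterns_with_bug = patterns + ["/"]
print([is_flagged(url, patterns_with_bug) for url in urls])   # [True, True]
```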

Google Doodles

On certain occasions, the logo on Google's webpage will change to a special version, known as a "Google Doodle". Clicking on the Doodle links to a string of Google search results about the topic. The first was a reference to the Burning Man Festival in 1998,[27][28] and others have been produced for the birthdays of notable people like Albert Einstein, historical events like the interlocking Lego block's 50th anniversary and holidays like Valentine's Day.[29] Some Google Doodles have interactivity beyond a simple search, such as the famous "Google Pacman" version that appeared on May 21, 2010.

Google Caffeine

In August 2009, Google announced the rollout of a new search architecture, codenamed "Caffeine".[30] The new architecture was designed to return results faster and to better deal with rapidly updated information[31] from services including Facebook and Twitter.[30] Google developers noted that most users would notice little immediate change, but invited developers to test the new search in its sandbox.[32] Differences noted for their impact upon search engine optimization included heavier keyword weighting and the importance of the domain's age.[33][34] The move was interpreted in some quarters as a response to Microsoft's recent release of an upgraded version of its own search service, renamed Bing.[35] Google announced completion of Caffeine on 8 June 2010, claiming 50% fresher results due to continuous updating of its index.[36] With Caffeine, Google moved its back-end indexing system away from MapReduce and onto BigTable, the company's distributed database platform.[37] Caffeine is also based on Colossus, or GFS2,[38] an overhaul of the GFS distributed file system.[39]

Encrypted Search

In May 2010, Google rolled out SSL-encrypted web search.[40] The encrypted search can be accessed at encrypted.google.com.[41]

Instant Search

Google Instant, a feature that displays suggested results while the user types, was introduced in the United States on September 8, 2010. Google expects Google Instant to save users 2 to 5 seconds in every search, which it says will collectively amount to 11 million seconds per hour.[42] Search engine marketing pundits speculate that Google Instant will have a great impact on local and paid search.[43]

In concert with the Google Instant launch, Google disabled the ability of users to choose to see more than 10 search results per page. Instant Search can be disabled via Google's "preferences" menu, but autocomplete-style search suggestions now cannot be disabled. A Google representative stated, "It's in keeping with our vision of a unified Google search experience to make popular, useful features part of the default experience, rather than maintain different versions of Google. As Autocomplete quality has improved, we felt it was appropriate to have it always on for all of our users."[44]

The publication 2600: The Hacker Quarterly has compiled a list of words that are restricted by Google Instant.[45] These are terms that Google's instant search feature will not search.[46][47] Most of the terms are vulgar or derogatory in nature, but some apparently unrelated searches, including "Myleak", are also removed.[47]

Redesign

In late June 2011, Google introduced a new look to the Google home page in order to boost the use of the Google+ social tools.[48]

One of the major changes was replacing the classic navigation bar with a black one. Google's digital creative director Chris Wiggins explains: "We're working on a project to bring you a new and improved Google experience, and over the next few months, you'll continue to see more updates to our look and feel."[49] The new navigation bar has been negatively received by a vocal minority.[50]

International

Google is available in many languages and has been localized completely or partly for many countries.[51]

The interface has also been made available in some languages for humorous purpose:

In addition to the main URL Google.com, Google Inc. owns 160 domain names, one for each of the countries/regions in which it has been localized.[51]

Search products

In addition to its tool for searching webpages, Google also provides services for searching images, Usenet newsgroups, news websites, videos, searching by locality, maps, and items for sale online. As of 2006, Google had indexed over 25 billion web pages,[52] 1.3 billion images, and over one billion Usenet messages, and was handling 400 million queries per day.[52] It also caches much of the content that it indexes. Google operates other tools and services including Google News, Google Suggest, Google Product Search, Google Maps, Google Co-op, Google Earth, Google Docs, Picasa, Panoramio, YouTube, Google Translate, Google Blog Search and Google Desktop Search.

There are also products available from Google that are not directly search-related. Gmail, for example, is a webmail application, but still includes search features; Google Browser Sync does not offer any search facilities, although it aims to organize the user's browsing time.

Google also launches many new beta products, such as Google Social Search and Google Image Swirl.

Energy consumption

Google claims that a search query requires altogether about 1 kJ or 0.0003 kW·h.[53]
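The two figures are the same quantity in different units, which can be verified with a short conversion. The Python sketch below relies only on the definition of the kilowatt-hour; the function name is hypothetical.

```python
JOULES_PER_KWH = 1000 * 3600   # 1 kW sustained for 3600 seconds


def kilojoules_to_kwh(kilojoules):
    """Convert an energy in kilojoules to kilowatt-hours."""
    return kilojoules * 1000 / JOULES_PER_KWH


per_query_kwh = kilojoules_to_kwh(1.0)
print(f"{per_query_kwh:.6f} kWh per query")        # about 0.000278, i.e. roughly 0.0003
print(f"{1 / per_query_kwh:.0f} queries per kWh")  # 3600
```

At the claimed figure, one kilowatt-hour corresponds to about 3,600 search queries.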

References

  1. ^ "WHOIS - google.com". http://reports.internic.net/cgi/whois?whois_nic=google.com&type=domain. Retrieved 2009-01-27. 
  2. ^ "Google.com Site Info". Alexa Internet. http://www.alexa.com/siteinfo/google.com. Retrieved 2012-01-02. 
  3. ^ "Alexa Search Engine ranking". http://www.alexa.com/siteinfo/google.com+yahoo.com+altavista.com. Retrieved 2009-11-15. 
  4. ^ "Almost 12 Billion U.S. Searches Conducted in July". SearchEngineWatch. 2008-09-02. http://searchenginewatch.com/showPage.html?page=3630718. 
  5. ^ ...The *, or wildcard, is a little-known feature that can be very powerful...
  6. ^ "WHOIS - google.com". http://reports.internic.net/cgi/whois?whois_nic=google.com&type=domain. Retrieved 2009-01-27. 
  7. ^ a b c d e f g h i j k l m n o p q r s t "Search Features". Google.com. May 2009. http://www.google.com/intl/en/help/features.html. 
  8. ^ a b c d "Google Help : Cheat Sheet". Google. 2010. http://www.google.com/help/cheatsheet.html. 
  9. ^ Voice Search for Google.com - Just click the mic and say your search. And, Search Google by giving Image
  10. ^ Hubbard, Douglas (2011). Pulse: The New Science of Harnessing Internet Buzz to Track Threats and Opportunities. John Wiley & Sons. 
  11. ^ "Top 500". Alexa. http://www.alexa.com/site/ds/top_sites?ts_mode=global&lang=none. Retrieved 2008-04-15. 
  12. ^ a b Google changes the operators.
  13. ^ Google.com
  14. ^ "Google:Stemming". Google. http://www.google.com/support/bin/answer.py?answer=35889#stemming. 
  15. ^ "'I'm feeling lucky' button costs Google $110 million per year". Valleywag. 2007. http://valleywag.com/tech/google/im-feeling-lucky-button-costs-google-110-million-per-year-324927.php. Retrieved 2008-01-19. 
  16. ^ "Google’s New Homepage Motto: 'This Space Intentionally Left Blank'". WallStreetJournal. 2009. http://digitaldaily.allthingsd.com/20091030/goog-page/. Retrieved 2009-11-17. 
  17. ^ Google.com
  18. ^ Goel, Kavi; Ramanathan V. Guha, Othar Hansson (2009-05-12). "Introducing Rich Snippets". Google Webmaster Central Blog. Google. http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snippets.html. Retrieved 2009-05-25. 
  19. ^ a b "Google and Search Engines". Emory University Law School. 2006. http://www.law.emory.edu/law-library/research/advanced-legal-research-class/finding-aids-and-searching/google.html. 
  20. ^ Google.com
  21. ^ a b c d e f g h "Google Help Center – Alternate query types", 2009, webpage: G-help.
  22. ^ a b "Google error page". http://www.google.com/support/bin/answer.py?answer=15661. Retrieved 2008-12-31. 
  23. ^ Krebs, Brian (2009-01-31). "Google: This Internet May Harm Your Computer". The Washington Post. http://voices.washingtonpost.com/securityfix/2009/01/google_this_internet_will_harm.html?hpid=news-col-blog. Retrieved 2009-01-31. 
  24. ^ a b Mayer, Marissa (2009-01-31). "This site may harm your computer on every search result?!?!". The Official Google Blog. Google. http://googleblog.blogspot.com/2009/01/this-site-may-harm-your-computer-on.html. Retrieved 2009-01-31. 
  25. ^ a b Weinstein, Maxim (2009-01-31). "Google glitch causes confusion". StopBadware.org. http://blog.stopbadware.org/2009/01/31/google-glitch-causes-confusion. Retrieved 2010-05-10. 
  26. ^ Cooper, Russ (January 31, 2009). "Serious problems with Google search". Verizon Business Security Blog. http://securityblog.verizonbusiness.com/2009/01/31/serious-problems-with-google-search/. Retrieved 2010-05-10. 
  27. ^ Hwang, Dennis (June 8, 2004). "Oodles of Doodles". Google (corporate blog). http://googleblog.blogspot.com/2004/06/oodles-of-doodles.html. Retrieved July 19, 2006. 
  28. ^ "Doodle History". Google, Inc. http://www.google.com/doodle4google/history.html. Retrieved 5-10-2010. 
  29. ^ "Google logos:Valentine's Day logo". February 14, 2007. http://www.google.com/logos/valentine07.gif. Retrieved April 6, 2007. 
  30. ^ a b Harvey, Mike (11 August 2009). "Google unveils new "Caffeine" search engine". London: The Times. http://technology.timesonline.co.uk/tol/news/tech_and_web/personal_tech/article6792403.ece. Retrieved 14 August 2009. 
  31. ^ "What Does Google "Caffeine" Mean for My Website?". New York: Siivo Corp. 21 July 2010. http://www.siivo.com/blog/2010/07/what-does-google-caffeine-mean-for-my-website. Retrieved 21 July 2010. 
  32. ^ Culp, Katie (12 August 2009). "Google introduces new "Caffeine" search system". Fox News. http://www.foxbusiness.com/story/markets/industries/technology/google-introduces-new-caffeine-search/. Retrieved 14 August 2009. 
  33. ^ Martin, Paul (31 July 2009). "Bing - The new Search Engine from Microsoft and Yahoo". Cube3 Marketing. http://blog.cube3marketing.com/2009/07/31/bing-the-new-search-engine-from-microsoft-and-yahoo/. Retrieved 12 January 2010. 
  34. ^ Martin, Paul (27 August 2009). "Caffeine - The New Google Update". Cube3 Marketing. http://blog.cube3marketing.com/2009/08/27/caffeine-the-new-google-update/. Retrieved 12 January 2010. 
  35. ^ Barnett, Emma (11 August 2009). "Google reveals caffeine: a new faster search engine". The Telegraph. http://www.telegraph.co.uk/technology/google/6009176/Google-reveals-caffeine-a-new-faster-search-engine.html. Retrieved 14 August 2009. 
  36. ^ Grimes, Carrie (8 June 2010). "Our new search index: Caffeine". The Official Google Blog. http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html. Retrieved 18 June 2010. 
  37. ^ Google search index splits with MapReduce – The Register
  38. ^ Google Caffeine: What it really is – The Register
  39. ^ Google File System II: Dawn of the Multiplying Master Nodes – The Register
  40. ^ "SSL Search: Features - Web Search Help". Web Search Help. Google. May 2010. http://www.google.com/support/websearch/bin/answer.py?answer=173733&hl=en. Retrieved 2010-07-07. 
  41. ^ Encrypted.google.com
  42. ^ Peter Nowak (2010). Tech Bytes: Google Instant (Television production). United States: ABC News. 
  43. ^ "How Google Saved $100 Million By Launching Google Instant". http://searchengineland.com/how-google-saved-100-million-by-launching-google-instant-51270. Retrieved 20 September 2010. 
  44. ^ Google Web Search Help Forum (WebCite archive)
  45. ^ 2600.com: Google Blacklist - Words That Google Instant Doesn't Like
  46. ^ CNN: Which words does Google Instant blacklist?
  47. ^ a b The Huffington Post: Google Instant Censorship: The Strangest Terms Blacklisted By Google
  48. ^ Boulton, Clint. "Google Redesign Backs Social Effort". eWeek Europe. eWeek Europe. http://www.eweekeurope.co.uk/comment/google-redesign-backs-social-effort-32954. Retrieved 1 July 2011. 
  49. ^ Google redesigns its homepage Los Angeles Times
  50. ^ Google support forum, one of many threads on being unable to switch off the black navigation bar
  51. ^ a b Language Tools
  52. ^ a b Google, Web Crawling and Distributed Synchronization p. 11.
  53. ^ Blogspot.com, Powering a Google search
